docs: add semble→Rust reference analysis under .please/docs/references #35

Merged
amondnet merged 3 commits into main from amondnet/reference on Jun 19, 2026

@amondnet amondnet commented Jun 19, 2026

Contributor

Summary

Adds a reference analysis document for MinishLab/semble — the upstream Python library that csp ports — mapped to the Rust rewrite (crates/csp + crates/csp-cli) per ADR-0003.

Analyzed at upstream semble 136b6f7 and Rust port 2f2baa2.

Changes

  • .please/docs/references/semble.md — module-by-module analysis: pipeline overview, semble→Rust module map, per-module deep dives (algorithms + Rust idioms), load-bearing constants table (semble vs Rust), divergences/drift section.
  • .please/docs/references/index.md — index for reference analyses, structured to scale as more upstream libraries are added.
  • .please/INDEX.md — registered the docs/references/ directory with a link to the new index.
  • CLAUDE.md — added a "Reference Analyses" entry under the Project Knowledge section.

Key findings surfaced

  • TD-002: search.rs ranking still uses inline stubs (apply_query_boost identity, rerank_top_k saturation-only); the full ranking::{boosting,penalties} modules are ported but unwired — mirroring the TS source. Search-ranking parity is fixture-level only.
  • Tree-sitter grammar coverage: Rust port uses a curated ~14-grammar static set vs upstream's tree_sitter_language_pack (≈all languages); unsupported languages fall back to line chunking.
  • Upstream drift: semble changed the desired chunk length to 750, but csp (TS + Rust) still uses 1500.
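
The grammar-coverage and chunk-length findings above can be pictured with a minimal Rust sketch. Everything in it is an assumption for illustration: `Strategy`, `strategy_for`, `chunk_lines`, the three listed extensions, and the byte-based length measure are hypothetical and are not taken from crates/csp or semble. Only the values 1500 and 750 come from the findings.

```rust
/// Desired chunk length the port still uses, per the drift finding.
const PORT_DESIRED_CHUNK_LENGTH: usize = 1500;
/// Desired chunk length upstream semble moved to, per the drift finding.
const UPSTREAM_DESIRED_CHUNK_LENGTH: usize = 750;

/// Hypothetical chunking strategy chosen per file.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// A grammar from the curated static set is available.
    Ast(&'static str),
    /// No grammar available: fall back to line chunking.
    Lines,
}

/// Hypothetical lookup; the real curated set has roughly 14 grammars.
fn strategy_for(extension: &str) -> Strategy {
    match extension {
        "rs" => Strategy::Ast("rust"),
        "py" => Strategy::Ast("python"),
        "ts" => Strategy::Ast("typescript"),
        _ => Strategy::Lines,
    }
}

/// Greedy line chunker: packs whole lines into a chunk and starts a new
/// chunk when adding the next line would exceed `desired_len` bytes.
/// A single line longer than `desired_len` becomes its own chunk.
fn chunk_lines(source: &str, desired_len: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for line in source.split_inclusive('\n') {
        if !current.is_empty() && current.len() + line.len() > desired_len {
            chunks.push(std::mem::take(&mut current));
        }
        current.push_str(line);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

With this shape, an unsupported extension resolves to `Strategy::Lines`, and calling `chunk_lines` with the upstream value instead of the port value would generally yield more, smaller chunks for the same input.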

Notes

  • Documentation only — no source or test changes.
  • No linked issue.

Summary by cubic

Adds a semble→Rust reference analysis under .please/docs/references with an index and links, documenting the port, algorithms, constants, divergences, and sync baselines. Highlights key gaps: ranking is stubbed in search.rs, Rust uses a curated grammar set, and upstream chunk length is 750 while the port remains 1500.

  • New Features
    • .please/docs/references/semble.md — module map, pipeline, algorithms, constants, and drift.
    • .please/docs/references/index.md — references index with a documents table, sync baselines, and how-to add analyses.
    • .please/INDEX.md — updates the references entry to point to the new index.
    • CLAUDE.md — adds a “Reference Analyses” section.

Written for commit 0833a21. Summary will update on new commits.

Summary by CodeRabbit

  • Documentation
    • Added a new “Reference Analyses” section with guidance that these materials are reference-only (not contracts)
    • Created reference documentation for the semble library, including end-to-end mapping, pipeline details, and module-by-module parity notes
    • Documented how to add new reference analysis documents (pin upstream commit, module mapping, and registration)
    • Updated the documentation index and adjusted a “Directory Map” entry’s descriptive text

Adds a module-by-module analysis of MinishLab/semble (upstream Python library)
mapped to the Rust port under crates/csp (lib) + crates/csp-cli (bin), per
ADR-0003. Analyzed at upstream semble 136b6f7 and Rust port 2f2baa2.

Files added/modified:
- .please/docs/references/semble.md: pipeline overview, semble→Rust module map,
  per-module deep dives (algorithms + Rust idioms), load-bearing constants table,
  and divergences/drift section.
- .please/docs/references/index.md: index for reference analyses, structured to
  scale as more upstream libraries are added.
- .please/INDEX.md: registered the docs/references/ directory.
- CLAUDE.md: added "Reference Analyses" entry under Project Knowledge.

Key findings surfaced:
- TD-002: Rust search.rs ranking stubs (apply_query_boost identity,
  rerank_top_k saturation-only) — ranking::boosting/penalties ported but
  unwired; mirrors TS source. Search-ranking parity is fixture-level only.
- Tree-sitter grammar set is a curated ~14-grammar static list in Rust vs
  upstream's tree_sitter_language_pack (≈all languages).
- Upstream drift: semble changed desired chunk length to 750; csp (TS + Rust)
  still uses 1500.

coderabbitai Bot commented Jun 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ff124b16-44aa-48c7-9c80-b1cd6eb3e6f9

📥 Commits

Reviewing files that changed from the base of the PR and between d286620 and 0833a21.

📒 Files selected for processing (1)
  • .please/docs/references/semble.md
✅ Files skipped from review due to trivial changes (1)
  • .please/docs/references/semble.md

Walkthrough

Adds a new .please/docs/references/ documentation section containing a module-by-module reference analysis of the MinishLab/semble Python library and its Rust port under crates/csp and crates/csp-cli. Updates CLAUDE.md and .please/INDEX.md to register the new section, and populates index.md with a documents table, sync baselines, and instructions for adding future analyses.

Changes

Semble Reference Analysis Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Reference docs index and registry wiring: .please/INDEX.md, .please/docs/references/index.md, CLAUDE.md | Updates the directory map description for docs/references/, creates the Reference Analyses index page with a Documents table, Sync baselines table, and how-to guide, and adds the new section entry in CLAUDE.md. |
| Semble module-by-module reference analysis: .please/docs/references/semble.md | Adds the full semble analysis: metadata header with pinned commit hashes, pipeline overview diagram, module map table, deep-dives for all subsystems (types, tokenization, AST chunking, file walking, dense/sparse retrieval, index orchestration, hybrid RRF, ranking, MCP, CLI), load-bearing constants table, divergences and drift notes, refresh instructions, and related links. |
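
The summary above mentions hybrid RRF without spelling it out. As a generic sketch of reciprocal rank fusion, and not the csp or semble implementation: each document receives the sum of 1/(k + rank) over the ranked lists it appears in, with ranks starting at 1. The function name `rrf_fuse`, the string ids, and whatever value is passed for `k` are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Fuses several ranked lists of document ids with reciprocal rank fusion.
/// `rankings[i][j]` is the id at rank `j + 1` in the i-th list.
/// Returns (id, fused score) pairs, highest score first; ties are broken
/// by id so the output order is deterministic.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            let rank = index as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

For example, fusing a dense ranking `["a", "b", "c"]` with a sparse ranking `["b", "c", "a"]` places `b` first, because it sits near the top of both lists.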

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 Hoppity-hop through modules galore,
A map of the semble from Python to Rust's core,
Constants and drift notes pinned side by side,
With sync baselines logged as a trustworthy guide,
The warren of docs grows richer today!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the primary change: adding comprehensive reference analysis documentation for semble→Rust module mapping under .please/docs/references/. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

codacy-production Bot commented Jun 19, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

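This pull request adds a detailed module-by-module analysis document (semble.md) for the Rust port (crates/csp) of the MinishLab/semble library, together with a reference analysis index document (index.md), and reflects the related links in INDEX.md and CLAUDE.md. The review feedback notes that the bottom of semble.md contains undefined Markdown link references ([upstream-semble-sync-baseline], [rust-rewrite-track-status]) that may render as broken links, and recommends changing them to plain text.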

Comment thread .please/docs/references/semble.md Outdated

codecov Bot commented Jun 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found and verified against the latest diff

Comment thread .please/docs/references/semble.md Outdated
@amondnet amondnet merged commit 6e31708 into main Jun 19, 2026
5 checks passed
@amondnet amondnet deleted the amondnet/reference branch June 19, 2026 08:03
@pleaseai-bot pleaseai-bot Bot mentioned this pull request Jun 19, 2026
@pleaseai-bot pleaseai-bot Bot mentioned this pull request Jun 20, 2026